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(54) Report generating system 



(57) A method of and system for generating reports 
from a database generate a master report from the da- 
tabase and from that master report generate one or 
more sub-reports including a proportion of the master 
report based on predefined criteria. The preferred em- 
bodiment provides a method and system for producing 
personalised electronic reports for access over a com- 
puter network. After generating an electronic report in 
an open document format (for example, HTML, RTF, 
PDF), commonly produced by a desktop reporting or 
'business intelligence' application, the system analyses 
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the logical layout of that report data and then, by com- 
parison with a database of recipient users and their at- 
tributes, the system outputs a subset of the original re- 
port personalised to the profile of each recipient. This 
subsetting or 'bursting' is carried out without recourse to 
regeneration of the whole report or any part thereof from 
Its source data. Furthemnore, as the structure and con- 
tent of the output sets is understood, the system pro- 
vides a facility to insert hyperlinks automatically so as 
to associate the sub-components of each subset togeth- 
er in order to allow the recipient to navigate more easily 
between related data items. 
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Description 

[0001] The present invention relates to a report gen- 
erating system, able to generate reports from data held 
in a database. 

[0002] Most organisations have a need to extract data 
and to produce reports from ever-increasing amounts of 
data captured and held in their transaction-processing 
operational systems. One of the most popular ways to 
do this Is to use end-user query and reporting tools (EU- 
QR) such as those from Brio, Business Objects, Cog- 
nos. or Microstrategy. The perceived advantages of 
these tools are that they abstract the technical complex- 
ity of the underlying database schema to allow non-tech- 
nical end users to create reports without enlisting the 
Involvement of scarce IT resource. 
[0003] One common drawback of end-user analysis 
is the machine load it places on the back-end systems. 
Data warehouses were built with the idea of alleviating 
this and to provide a consolidated uniform view of a com- 
pany's data at a given point in time. However, the typical 
queries generated by a business analyst exploring data 
are still hardware-intensive and in most organisations 
this facility has to be restrictedto a small number of peo- 
ple. 

[0004] There is a growing realisation that many of the 
reports produced by such people, sometimes called 
power users or knowledge workers, would be of great 
value to a much wider audience of infonnation consum- 
ers within the organisation, such as line managers who 
would all benefit from "human resources" oriented re- 
ports, sales advisors who would benefit from marketing 
campaign and sales data, and branch managers who 
would benefit from sales and returns infomnation and so 
on. 

[0005] Due to the resource intensive nature of busi- 
ness intelligence queries it is not practical to allow these 
information consumers to refresh their reports on de- 
mand against the database as this would put an unac- 
ceptable loading on the system and result in significant 
and serious degradation of system response times. 
However, as the vast majority of information consumers 
have a predictable infonnation requirement {for exam- 
ple, a branch manager wants to see the same informa- 
tion on a dally basis about his sales and returns), It is 
possible to pre-prepare their Infomaatlon for them at a 
suitable time (for example ovemighl). This batch report- 
ing approach also has restrictions, when a report has to 
be refreshed individually against the database for each 
intended recipient to ensure that each user receives a 
personalised view of information. Consider a retailer 
with 500 outlets, where each branch manager requires 
a view of the report that only displays information about 
his/her store sales. If it takes 30 seconds to refresh the 
report against the database for one outlet, then it will 
take over 4 hours to produce the output for every store. 
If there are 6 reports that need to be supplied to each 
branch on a daily basis, then there are not enough hours 



in the day to publish the required infonnation. 
[0006] The present invention seeks to provide an im- 
proved report generating system. 
[0007] According to an aspect of the present inven- 
5 tion, there is provided a method of generating reports 
from a database including the step of generating a mas- 
ter report from the database and generating from the 
master report one or more sub-reports including a pro- 
portion of the master report based on predefined criteria. 
10 [0008] The master report is preferably stored in com- 
puter memory. This enables generation of sub-reports 
without the need to query the database again and there- 
fore can save considerable processing time. 
[0009] Advantageously, the master report is provided 
15 with a plurality data units each having associated there- 
with a marker, the sub-report or reports being generated 
by obtaining data units having markers matching or as- 
sociated with said predefined criteria. 
[0010] In the preferred embodiment, an index of mark- 
20 ers is created, said Index being scanned for matches 
with said predetermined criteria. 

[0011] The predefined criteria are preferably stored in 
memory and accessed to generate automatically sub- 
reports. 

25 [0012] The report is preferably stored as a standard 
format, such as Hypertext Markup Language (HTML), 
Rich Text Format (RTF), Portable Document Format 

(PDF). 

[0013] According to another aspect of the present in- 
30 vention. there is provided a system for generating re- 
ports from a database including means for generating a 
master report from the database, and means for gener- 
ating from the master report one or more sub-reports 
including a proportion of the master report based on pre- 
35 defined criteria. 

[0014] The preferred embodiment provides a method 
and system for producing personalised electronic re- 
ports for access over a computer network. After gener- 
ating an electronic report in an open document format 
40 (for example. Hypertext Markup Language (HTML), 
Rich Text Format (RTF), Portable Document Format 
(PDF) and so on), commonly produced by a desktop re- 
porting or 'business intelligence* application, the system 
analyses the logical layout of that report data and then, 
^5 by comparison with a database of recipient users and 
their attributes, the system outputs a subset of the orig- 
inal report personalised to the profile of each recipient. 
This subsetting or 'bursting' is carried out without re- 
course to regeneration of the whole report or any part 
50 thereof from its source data. Furthermore, as the struc- 
ture and content of the output sets is understood, the 
system provides a facility to insert hyperlinks automati- 
cally so as to associate the sub-components of each 
subset together in order to allow the recipient to navigate 
55 more easily between related data items. 

[0015] An embodiment of the present invention is de- 
scribed below, by way of example only, with reference 
to the accompanying drawings, in which: 
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Figure 1 shows examples of source reports: 
Figure 2 shows examples of output report frag- 
ments of the example source reports of Figure 1: 
Figure 3 shows an example of output for fictitious 
users John and Jane;; 

Figure 4 is a flow chart illustrating the preferred em- 
bodiment of report bursting process; and 
Figure 5 is an overview of the preferred embodi- 
ment of system. 

[001 6] The preferred system runs the report just once 
against the database and then intelligently 'bursts' the 
resultant output, so that each user receives a subset of 
the report based on his/her profile as defined in a central 
repository database. This profile would specify that a 
particular user only sees specific information, for exam- 
ple for branch number 1234. By running the report just 
once against the database^ the loading on the database 
server is dramatically reduced, with a corresponding in- 
crease in scalability to allow very large user communi- 
ties to be supported. 

[0017] Once the business intelligence report has 
been refreshed just once against the database, the large 
piece of resultant output may be analogous to an ency- 
clopaedia, wherein each entry is uniquely identified by 
name and may contain several paragraphs and/or pic- 
tures (such as tables of data and graphs). The 'bursting' 
of the report to different recipients is accomplished by 
programmatically comparing the index of encyclopaedia 
entries to a list of each recipient's interests and then cop- 
ying out the matching paragraphs to form a new docu- 
ment for each user, containing only those entries which 
match their registered interests. In the example of a 
business intelligence report, the recipients interests 
could relate to their security profile (e.g. branch=1234, 
department=sales, cost centre=4389 and so on). 
[0018] The report writing modules within the EUQR 
tools offer the facility to break the report into smaller sub- 
sections to allow sub-totals and other calculations to be 
added and also to make it easier for the user to read the 
presented information. For example, a human resourc- 
es report may be sub-sectioned by department so that 
there is one element for every department within the or- 
ganisation, and within each sub-section a table can be 
presented showing holiday entitlement for every em- 
ployee who works in that department. 
[0019] When a document is created the user needs 
to define which of the available profiles should be ap- 
plied to personalise the information for the intended re- 
cipients (for example, Department). If the user creates 
one sub-section within the document for each of these 
profiles, it is possible subsequently to divide the report 
output into smaller pieces to ensure that each user only 
sees those elements of the original report to which their 
profile entitles them. 

[0020] EUQR tools, as is the norm for modern desktop 
applications, have the capability to save generated doc- 
uments to persistent disk storage in a number of differ- 
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ent machine-readable formats typically including: a pro- 
prietary binary format. Rich Text Format (.rtf), plain text 
(.txt). Portable Document Format ( pdf). and Hypertext 
Markup Language (.htm). Aside from the proprietary for- 

5 mat these are ail widely used standard formats, which 
can be read and understood by a number of different 
application programs (for example, Microsoft Word, In- 
ternet Explorer). In each case the conversion to a differ- 
ent format results in a file that, on being read and dis- 

10 played by another program, can be regarded as a visual 
likeness of the original report produced in the EUOR 
tool, differing only in the fidelity of its reproduction, ac- 
cording to the constraints and capabilities of each Indi- 
vidual fomiat. These non-proprietary output formats can 

15 all be considered suitable inputs to the 'bursting' proc- 
ess. 

[0021] Once the output is available in a non-proprie- 
tary format, the first step, as with the encyclopaedia ex- 
ample, is to produce an 'index' to represent the logical 
^0 structure of the document. The next stage is to retrieve 
from the central repository database, the list of intended 
recipients for this particular document and their individ- 
ual profiles. 

[0022] The final stage is to scan the index for every 

25 recipient's profile to determine which elements of the re- 
port they are authorised to access and then to both 
record this Information within the central repository da- 
tabase and save this indexed output set to disk storage. 
This ensure that each user accesses his/her individual 

30 view of information on request to view the report. 

[0023] The system can thus provide a repository 
(memory) that contains data about recipients/users, en- 
tered and maintained through an administrative pro- 
gram interface or automatically by interfacing to the 

35 open application programming interface (API), for burst- 
ing reports to those users based on that control data 
held in the repository. Such 'bursting' provides a method 
of running the database query once only and then intel- 
ligently segmenting the resultant output to match user 

40 requirements held in the central repository database, 
thus drastically reducing the time to create individual- 
ised reports for a mass audience. Users can be internal 
or external to an organisation and can be identified by 
name or alternattvely^by role. Users can be organised 

•^5 into groups, and profiles can be associated to groups 
and/or individual users. Each combination of a user and 
a profile can have a profile value set for it, such as for 
user Fred: Region=North. 

[0024] The reporting tool is programmatically control- 
50 led via its own published API or similar Interface to write 
out an HTML (or other non proprietary format) represen- 
tation of the report. This may constitute one or several 
physical HTML files depending on the tool and the struc- 
ture of the report. The structure of the report is derived 
55 by reference to the API of the tool and/or by parsing of 
the HTML output, and stored In the database repository. 
Of key significance is to locate in this structure the loca- 
tion of the data elements on which the recipients are pro- 
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filed, is the example given the whereabouts on the report 
of the Region name(s). In one embodiment, this struc- 
ture detail is held in XML format (extensible Markup 
Language), a format particularly suited to processing hi- 
erarchical data, so as to expedite the traversal of any 
data hierarchies and the selection of report elements 
matching certain criteria. 

[0025] XML is a method for putting structured data in 
a text file. In this description, For "structured data" is in- 
tended to relate to such things as spreadsheets, ad- 
dress books, configuration parameters, financial trans- 
actions, technical drawings, etc. Programs that produce 
such data often also store it on disk, for which they can 
use either a binary format or a text format. The latter 
allows, if necessary, looking at the data without the pro- 
gram that produced it. XML is a set of rules, guidelines, 
conventions, whatever you want to call them, for design- 
ing text formats for such data, in a way that produces 
files that are easy to generate and read (by a computer), 
that are unambiguous, and that avoid common pitfalls, 
such as lack of extensibility, lack of support for interna- 
tionalisatlon/localisation, and platform-dependency. 
[0026] The whole process is the programmatic equiv- 
alent of printing a whole slew of fanfold paper, splitting 
out the individual pages, and then by reference to a list 
of recipients interests, for example names of all regional 
managers, copying each individual page sufficient times 
to enable a bundle of pages to be stapled together and 
put in an internal post system envelope for despatch to 
the recipients. 

[0027] It is common for a report document to contain 
several different views of (subsets of) the same data, for 
example, sales details may be presented at each of Re- 
gion, Area, and Store levels (see Figure 1). By identify- 
ing common elements in each of these views it is pos- 
sible to achieve two further effects: 

A) first, it is possible to determine the hierarchical 
structure (if any) of the data in the report. For ex- 
ample, given the report in Figure 1 , it Is possible pro- 
grammatically to ascertain from the report that 
stores in Leeds and Bradford are in the Yorkshire 
Area (B), which is in turn a component part of the 
North Region (A). This has two benefits: 
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20 



25 



1) users only need profiles to be set for them at 
the highest level of information they are author- 
ised to access, so in the case of the manager 
of North Region (A), he/she only needs a profile 
setting at Region level and the tool will auto- 
matically deduce that they could also see Areas 
Yorkshire (B) and Manchester (C) and the 
stores in those areas (D); 
2) it is also possible optionally to present infor- 
mation to users that is higher up the same leg 
of the hierarchy in which they reside. Thus, 
rather than just bursting the data for Yorkshire 
Area (B) to the recipients with a profile of Ar- 



ea=Yorkshlre, the system can optionally also 
provide them with a view of the data for North 
Region (A) but not the data for any other Re- 
gions or Areas, for example they do not receive 
data for Manchester Area (C) and its compo- 
nent stores; 

8) secondly, the system can allow users to navigate 
quickly around the report by altering the HTML of 
the report output to insert hyperlinks around the 
common elements which, when invoked, cause the 
users currently viewed report fragment to be re- 
placed with another which contains data about the 
entity on which they clicked. For example, a report 
containing data for the North Region (A) may con- 
tain a table listing all the Areas contained in that re- 
gion, and by clicking on the name of an Area (that 
has been programmattcally converted Into a hyper- 
link) the user can be navigated to a separate report 
fragment containing more detailed data for that se- 
lected store. In Figure 2 such hypertinks have been 
inserted into the example report used in Figure 1 . 
In this case, clicking on the word 'Manchester" in (A) 
would cause the user's view to change to display 
the report fragment (C). A facility is also provided 
via a hyperlink (in this example the text 'back up to. .. 
') to allow the user to navigate back up to the higher- 
level information, so that the user viewing informa- 
tion at the area level would be able to link back up 
to the region level. 

[0028] Figure 4 shows an example of technical flow 
outline, which summarises the following steps, 

1. Store user details in repository database 

[0029] It is first necessary to have stored in a central 
repository database details for users, profiles, and re- 
ports in repository database. Examples of the required 
"^0 data could be: 



'John* is a User 
'Susan' is a User 
Tom' is a User 
'Jane' Is a User 
'Area' Is a Profile 

For the Profile 'Area', John has the value 'Yorkshire' 
For the Profile 'Area', Susan has the value 'Man- 
chester' 

For the Profile 'Area', Jane has the value 'Yorkshire' 
Notice that Tom' has no profile values set. 
'Weekly Sales' is a Report 

The 'Weekly Sales' Report is to be distributed to Us- 
ers John, Susan, Tom and Jane. 



30 



45 



50 



55 



[0030] This data entry can be accomplished using 
web-based program or bulk-load API. 
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6. Relate profiles to report structure 



[0031 ] A user can attach some progrann or 'plug-in* to 
the reporting tool program, using its API, and configure 
it so as to receive notification whenever a report is 
opened. The exact mechanism for this will obviously 
vary by each reporting tool but, due to conventions and 
trends in the design of such desktop applications (often 
called 'personal productivity software'), will always be 
similar. This usually takes the form of a piece of Visual 
Basic for Applications (VBA) code. (VBA is a program- 
ming environment produced by Microsoft^™) and li- 
censed to third-party software vendors to embed within 
their desktop applications in order to provide a common 
programming and event model between such tools.) 
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[0038] For each such block or subsection found, the 
system determines whether the summary level is anal- 
ogous to one of the profiles defined in the repository da- 
tabase. For example, the sections of the 'Weekly Sales* 
report are summarised at Region (A), Area (B and C) 
and Store (D) levels. Therefore the system searches the 
profile information held in the repository to see if 'Re- 
gion', 'Area', or 'Store' are defined as profiles. In this ex- 
ample there is a match for 'Area' so the system knows 
that these sections of the report can be burst out accord- 
ing to user profiles. 

7. Cross-reference profiles to users 



3. Check each report for relevance 

[0032] Once a report-open notification is received the 
repository database must be queried to see if this par- 
ticular report is one to be burst out to the pre-defined 
users. For example, report 'Weekly Sales' has just been 
opened: does a record exist for it in the repository data- 
base? 

[0033] If not then control is returned to the reporting 
tool and no further action is taken. 

4. Create list of recipients 

[0034] If the report is described in the repository then, 
by reference to other data in the repository, the system 

can generate a list of the intended recipients. 
[0035] For example, report Weekly Sales' has been 
opened and is a report to burst, so the system scans the 
repository to reveal that John, Susan, Tom and Jane are 
all intended recipients. 

5. Extract report structure 

[0036] Using the reporting tool API (usually VBA- 
based) the system gathers report structure and hierar- 
chy information and stores it in the repository database. 
The division of the report into separate blocks or sec- 
tions is of particular interest, as this will be used later as 
an indication of the profiles to be applied. 
[0037] For example, the 'Weekly Sales' report (Figure 
1) has separate blocks or subsections for the North re- 
gion (A), the Yorkshire and Manchester areas (B) and 
(C), the Leeds store (D) etc. We can also see that in the 
regional section (A), one of the data columns is entitled 
•Area*. Likewise, in (B) and (C) we can see data columns 
for 'Store'. Additionally, by examining the values in the 
'Area' and 'Store' columns we can build up a map of the 
hierarchy contained in the report, that is we can deduce 
that the Leeds store is in the Yorkshire area and the 
North region but is not part of the Manchester area. All 
this Information will be stored away in the repository da- 
tabase. 



[0039] For each profile found, the system checks 
whether any of the recipients have a value set. For ex- 
ample, the profile for 'Area' must be applied lo this re- 
20 port, which is intended for John, Susan, Tom and Jane, 
but only the first three of them have values set for this 
profile ('Yorkshire', 'Manchester', and 'Yorkshire' respec- 
tively). 

25 8. Generate list of required outputs 

[0040] From the list of recipient profile values, the sys- 
tem generates an exhaustive list of all required output 
versions or 'instances'. For example, In the illustrated 
30 report it is requires to produce three different sets of out- 
put: 

a version for Area = Yorkshire, wh ich will be for John 
and Jane 

35 a version for Area = Manchester, which will be for 
Susan 

an unrestricted version, which will be for Tom 

9. Save report as HTML 

40 

[0041] The following steps are performed here: 

a) instruct the reporting tool (again via its API) to 
save the report to^disk file(s) in HTML format 
45 b) record the location of the saved file(s) in the re- 
pository database; 

c) depending on both the tool in question and the 
structure of the report Itself one or more HTML files 
are generated. 

50 

[0042] For example, the 'Weekly Sales' report may be 
saved as four different HTML files, one for each of the 
sections (A,B,C,D). Some reporting tools will also save 
a fifth HTML file that is a kind of index to the other four. 

55 

10. Build index of HTML structure 

[0043] The system parses the HTML f ile(s) in order to 
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understand which pieces of HTML correspond to which 
sections of data (by reference to the report structure in- 
formation previously derived from the reporting tool API) 
and build in-memory XML structures to represent this. 
For example, it is possible to discern which of the four 5 
HTML files corresponds to which section (A,B,C,D) and 
whereabouts in each of those files the different data 
items are, that is the location of each data column and 
summary line within the HTML file. So in Section (A), it 
is possible to derive that the first column in the table was io 
'Area'. 

11 ■ Create output instances 

[0044] The system loops through the list of required is 
output instances and, for each one, uses the associated 
profile value Infonrtation to look-up the required ele- 
ments of HTML in the XML Index structures. A new XML 
structure is built that describes which pieces of HTML 
content make up this unique output instance and this 20 
information is saved to disk. This XML file also contains 
some metadata (data which describes other data) to de- 
scribe its contents. 

[0045] For example, the output instance to be pro- 
duced for John and Jane (Area = Yorkshire) will consist 25 
of HTML fragments representing report sections (B) and 
(D) (remember we know from the hierarchy data gar- 
nered in step 5 that the Leeds store is part of the York- 
shire Area). Therefore the system can create an XML 
file pointing to the HTML for (B) and (D), containing so 
some metadata which describes the content as being 
for the Yorkshire Area and the Leeds store, and write it 
to disk with a unique name. The name and location of 
this XML file is saved in the repository database to en- 
able later retrieval of this information on user request. 35 



13 End of Process 



[0048] Once all output instances have been produced 
in this way and their locations recorded In the repository 
database the report bursting process is at an end. 



Claims 

1 . A method of generating reports from a database in- 
cluding the step of generating a master report from 
the database and generating from the master report 
one or more sub-reports including a proportion of 
the master report based on predefined criteria. 

2. A method according to claim 1 , wherein the master 
report Is stored In computer memory. 



3. 



A method according to claim 1 or 2, wherein the 
master report is provided with a plurality data units 
each having associated therewith a marker the 
sub-report or reports being generated by obtaining 
data units having markers matching or associated 
with said predefined criteria. 

A method according to claim 3, wherein an index of 
markers is created, said index being scanned for 
matches with said predetermined criteria. 

A method according to any preceding claim, where- 
in the predefined criteria are stored in memory and 
accessed to generate automatrcally sub-reports. 



6. A method according to any preceding claim, where- 
in the report Is stored as a standard format. 



12. Insert navigational hyperlinks 

[0046] As each instance XML file Is being produced it 
is also possible to alter the referenced HTML files to add 
the hyperlinks between related data items. In this case 
it is necessary to save the altered HTML under a differ- 
ent name so as to avoid impacting the next iteration of 
this loop. For example, the output Instance for John and 
Jane consists of two logically related HTML fragments. 
The system will cause the text "Leeds' in the Store col- 
umn of fragment (B) to become a hyperlink which points 
to fragment (D). Therefore, when John is viewing frag- 
ment (B) in his web browser he can see that 'Leeds' is 
a hyperlink (it is underlined and the mouse pointer will 
change shape when passed over it, as is the norm for 
a hyperlink within an HTML document viewed in a web 
browser) - this is shown in Figure 3. 
[0047] Similarly the output instance for Tom (who has 
no profiles to be applied) will consist of four hyperlinked 
HTML files (see Figure 2). Section (A) will contain links 
to both sections (B) and (C), and section (B) will contain 
a link to section (D). 



7. A method according to claim 6, wherein the report 
is stored as Hypertext Markup Language (HTML), 
Rich Text Format (RTF) or Portable Document For- 
mat (PDF). 

8. A system for generating reports from a database in- 
cluding means for generating a master report from 
the database, ar\d means for generating from the 
master report one or more sub-reports Including a 
proportion of the master report based on predefined 
criteria. 

9. A system according to claim 8, including a computer 
memory In which the master report is stored. 

10. A system according to claim 8 or 9, wherein the 
master report is provided with a plurality data units 
each having associated therewith a marker, the 

55 sub-report or reports being generated by obtaining 

data units having markers matching or associated 
with said predefined criteria 
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11. A system according to ctaim 10. wherein an index 
of markers is created, said index being scanned for 
matches with said predetermined criteria. 

12. A system according to any one of claims 8 to 11, 5 
wherein the predefined criteria are stored in mem- 
ory and accessed to generate automatically sub-re- 
ports. 

10 
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